Dynamics of natural product Lupenone as a potential fusion inhibitor against the spike complex of novel Semliki Forest Virus

Semliki Forest Virus (SFV) is a positive-strand RNA virus that belongs to the Alphavirus genus of the Togaviridae family. An epidemic observed among French troops stationed in the Central African Republic was most likely caused by SFV. The viral spike protein is made up of the two transmembrane proteins E1 and E2 and the peripheral protein E3. The virus binds to the host cell and is internalized via endocytosis; endosome acidification causes the E1/E2 heterodimer to dissociate and the E1 subunits to trimerize. In this study, Lupenone was evaluated against the E1 spike protein of SFV using cheminformatics approaches, including molecular docking, molecular dynamics simulation, and binding free energy calculation. Molecular docking indicated that the major interactions of Lupenone with the binding cavity residues were non-bonded van der Waals and Pi-alkyl interactions. Molecular dynamics simulation over a 200 ns time scale corroborated the interaction pattern between Lupenone and the E1 spike protein obtained from docking. Moreover, Lupenone formed a stable complex with the E1 spike protein, as substantiated by free energy landscape (FEL) and principal component analysis (PCA). Free energy decomposition over the binding cavity residues of the E1 spike protein further showed that non-bonded van der Waals interactions contributed most of the energy of interaction with Lupenone. Lupenone therefore interacted strongly at the active site, and the complex retained high structural stability throughout its dynamic evolution. This study thus supports Lupenone as a lead molecule against the SFV E1 spike protein for future therapeutic purposes.

Introduction
Semliki Forest Virus (SFV) is a positive-stranded RNA virus that belongs to the Alphavirus genus of the Togaviridae family. The virus was first isolated from mosquitoes (Aedes aegypti) in Semliki Forest, Uganda [1]. Although SFV has been studied extensively, it was previously regarded as one of the arboviruses that are non-hazardous to people [2], which makes its association with human disease surprising. SFV strains span a range of virulence: V13 and L10 are the most virulent strains, whereas A7 and A8 are among the least virulent. L10 was the first known SFV strain. It is neurovirulent, infects the central nervous system (CNS) of newborn mice, and causes lethal encephalitis. A7 infection kills every newborn mouse, but adult mice are resistant because the strain is avirulent in adults. Oligodendrocytes, neurons, and plexus cells were the only CNS cells infected, and this held for all strains. SFV has been linked to disease in both animals and humans: it was identified as the likely source of an outbreak among French troops stationed in the Central African Republic after a case was confirmed among the soldiers [3]. A fatal case linked to a laboratory SFV infection has also been reported [4]. For this reason, SFV is categorized as a Biosafety Level 3 virus in the United States, according to the CDC [5]. In the European Union, the virus is classified as a Biosafety Level 2 virus [6,7].
Like other alphaviruses (such as Chikungunya virus, O'nyong'nyong virus, Sindbis virus, and Ross River virus), SFV may evolve into a health danger to humans. This possibility is supported by evidence from mouse models and by the few known human infections.
SFV is a well-characterized alphavirus that infects cells by endocytosis and pH-dependent fusion, and it has been studied extensively [7]. Its genome encodes a total of nine functional proteins, both structural and non-structural. The structural proteins are essential for producing the capsid and envelope of the virus, whereas the non-structural proteins are involved in viral transcription and transcriptional elongation [8,9]. The viral spike protein is composed of three components: two transmembrane proteins, E1 and E2, and a peripheral protein, E3 [9]. The E2 subunit, or its intracellular precursor p62, is required for nucleocapsid binding and forms heterodimers specifically with the E1 subunit [10,11]. To infect a host cell, the virus must first bind to it and is then internalized by endocytosis. Endosome acidification dissociates the E1/E2 heterodimer, after which the E1 subunits trimerize [11,12]. As illustrated in Fig 1, the E1 trimer drives fusion of the endosomal and viral membranes, which releases the viral nucleocapsid into the cytoplasm of the cell. The E3 peptide forms the N-terminal region of the p62 precursor, which is required for assembly of the complex and transport of viral structural components to the site of budding [13][14][15].
The two SFV spike glycoproteins p62 and E1 interact to form a heterodimeric complex that is delivered to the cell surface. The E2 component, which is required at the plasma membrane for viral budding, is initially synthesized as the precursor protein p62 and then undergoes proteolytic processing to become mature E2. During maturation, the virus forms a heterodimer of E1 and E2 (the cleavage product of p62) that is more labile and acid-sensitive than the earlier E1-p62 complex. Numerous investigations have shown that E1 is the fusion-active component of SFV, regardless of whether E2 undergoes an acid-induced conformational shift or whether cleavage is required for fusion [15].
Sequencing of the E1 component from a variety of alphaviruses showed that it contains a remarkably conserved hydrophobic region [16]. This region spans 23 amino acids and lies about 75 residues from the N terminus of E1 [17][18][19][20][21][22], a small stretch given the size of the protein. The term "fusion peptide" has been proposed for it in acknowledgment of its nonpolar character, its high conservation, and its similarity to hydrophobic parts of spike proteins from other viruses [16]. Kielian and co-workers mutated SFV spike protein cDNA clones in vitro to change individual amino acids within the conserved region and expressed the protein in COS cells to assess its cell-cell fusion activity at low pH [23,24]. Changing certain amino acids in this region considerably changed the efficiency of the fusion process. In light of these observations, the researchers concluded that the internal hydrophobic domain of the SFV E1 protein is critical for membrane fusion [13].
New medications with pharmacological effects are regularly discovered among the active components of plants, so it is critical to identify the phytochemicals responsible for these effects. Lupenone, a triterpenoid phytochemical, is found in abundance across the plant kingdom, particularly in the families Asteraceae, Cactaceae, Iridaceae, Musaceae, Urticaceae, Leguminosae, and Bombacaceae, among others. Numerous studies have been conducted on Lupenone. It has been found to inhibit protein tyrosine phosphatase 1B (PTP 1B), which appears to be a viable target for novel diabetes and obesity medications [25]. In addition, Lupenone from the plant Euphorbia segetalis has been shown to have a significant inhibitory effect on viral plaques of HSV-1 and HSV-2 [26] and has potential against a variety of DNA and RNA viruses [27]. We therefore selected Lupenone for its antiviral properties. Furthermore, we found that it is a possible fusion inhibitor for SFV because of its high binding affinity for the fusion protein 2ALA (SFV E1 protein).
In this study, Lupenone is investigated for its antiviral potential against SFV. In addition, this paper describes the preclinical research conducted to evaluate Lupenone as a potential antiviral compound.

Target preparation
The crystal structure of the E1 spike protein, 2ALA, was acquired from the Protein Data Bank. The structure was saved in .pdb format after co-crystal ligands and heteroatoms were removed. UCSF Chimera was used to minimize the structure, using the steepest descent algorithm followed by conjugate gradient. Minimization used the AMBER ff14SB force field, followed by neutralization of the net charge to 0. The ligand (Lupenone) was obtained in .sdf format from PubChem. The ligand structure was loaded into the DS visualizer and saved as a .pdb file.
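The heteroatom-removal step can be illustrated with a short script. This is only a minimal sketch, assuming plain-text PDB records; it is not the procedure the authors ran in Chimera, and the function name and the two-record sample are hypothetical.

```python
def strip_heteroatoms(pdb_text: str) -> str:
    """Keep protein ATOM records (plus TER/END) and drop everything else.

    HETATM records (waters, ions, co-crystal ligands) are discarded, as are
    header and CONECT lines, leaving a bare protein coordinate file.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # PDB record name occupies columns 1-6
        if record in {"ATOM", "TER", "END"}:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical two-record sample: one protein atom and one water molecule.
sample = (
    "ATOM      1  N   TYR A   1      11.104  13.207   2.100  1.00 20.00           N\n"
    "HETATM    2  O   HOH A 401      15.000  10.000   5.000  1.00 30.00           O\n"
    "TER\n"
    "END\n"
)
cleaned = strip_heteroatoms(sample)
```

In practice the cleaned text would be written back to a .pdb file before minimization.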

Virtual screening
Phytochemicals reported from Euphorbia segetalis were initially screened against our target protein, 2ALA, using Autodock Vina 4.2.6. Among these phytochemicals, Lupenone (Chem I.D. 92158) had the best dock score.

Molecular docking of fusion protein with Lupenone
Semliki Forest Virus belongs to the Alphavirus genus and relies on conformational changes of its E1 spike (PDB I.D. 2ALA) to invade the cell and release its genome; we therefore took E1 as our protein of interest. In the docking and molecular dynamics studies, 2ALA was used as the target protein. Lupenone, the active ingredient in the leaf extract of Euphorbia segetalis, was used as the ligand. Molecular docking studies were carried out between the target protein E1 spike and Lupenone to understand the mechanism of Lupenone action. The three-dimensional coordinates of the E1 spike, solved by X-ray crystallography at 3.00 Å resolution, were downloaded in .pdb format from the RCSB PDB repository (PDB ID: 2ALA; https://www.rcsb.org/structure/2ALA). The 3D structure of Lupenone was procured from the public database PubChem in SDF format and subsequently converted to .pdb format using OpenBabel 2.2.3 [28]. Autodock version 4.2.1 was used for the molecular docking studies.

Molecular dynamics simulation (MD) and free energy landscape analysis
MD simulations were carried out in triplicate on the docked complex of 2ALA with Lupenone using Desmond 2020.1 from Schrödinger, LLC. The same parameters were used for each of the three runs to assess the reproducibility of the results. The OPLS-2005 force field [29][30][31] and an explicit solvent model with SPC water molecules were used [32]. Na+ ions were added to neutralize the charge, and 0.15 M NaCl was added to simulate the physiological environment. Initially, the system was equilibrated using an NVT ensemble for 200 ns to restrain the protein-Lupenone complex. A short run of equilibration and minimization was then carried out using an NPT ensemble for 12 ns. The NPT ensemble was set up using the Nose-Hoover chain coupling scheme [33], with a temperature of 27 °C, a relaxation time of 1.0 ps, and a pressure of 1 bar maintained in all simulations. A time step of 2 fs was used. The Martyna-Tuckerman-Klein chain coupling scheme [34] was used as the barostat for pressure control, with a relaxation time of 2 ps. The particle mesh Ewald method [35] was used to calculate long-range electrostatic interactions, and the cutoff radius for Coulomb interactions was fixed at 9 Å. The RESPA integrator was used with a time step of 2 fs for each trajectory to calculate the bonded forces. The root mean square deviation (RMSD), radius of gyration (Rg), root mean square fluctuation (RMSF), and number of hydrogen bonds (H-bonds) were calculated to monitor the stability of the MD simulations. The free energy landscape of protein folding for the Lupenone-bound complex was computed using geo_measures v 0.8 [36]. Geo_measures includes a library built on g_sham; the energy profile of folding along RMSD and radius of gyration (Rg) was computed from the MD trajectory and recorded in a 3D plot using the matplotlib Python package.
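The three stability measures named above have simple definitions that can be sketched in NumPy. The snippet below is illustrative only and is not the Desmond analysis the authors used: it assumes the trajectory is already available as an array of Cα coordinates, uses a Kabsch superposition before the RMSD (a common convention, assumed here), and all function names are hypothetical.

```python
import numpy as np


def kabsch_align(mobile, reference):
    """Superimpose mobile (N, 3) onto reference (N, 3) by rigid-body fit."""
    mob_c = mobile - mobile.mean(axis=0)
    ref_c = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # avoid improper rotations
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mob_c @ rot.T + reference.mean(axis=0)


def rmsd(a, b):
    """Root mean square deviation between two (N, 3) coordinate sets."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of an (N, 3) coordinate set."""
    masses = np.ones(len(coords)) if masses is None else np.asarray(masses)
    com = np.average(coords, axis=0, weights=masses)
    sq = np.sum((coords - com) ** 2, axis=1)
    return float(np.sqrt(np.sum(masses * sq) / masses.sum()))


def rmsf(traj):
    """Per-atom fluctuation for an aligned trajectory of shape (T, N, 3)."""
    mean_structure = traj.mean(axis=0)
    return np.sqrt(np.mean(np.sum((traj - mean_structure) ** 2, axis=2), axis=0))
```

For a real trajectory, each frame would be aligned to the 0 ns structure with `kabsch_align` before calling `rmsd` or `rmsf`.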

Molecular Mechanics Generalized Born and Surface Area (MMGBSA) calculations
The binding free energy (ΔGbind) of the docked complex was calculated from the MD simulations of 2ALA complexed with Lupenone using the Prime molecular mechanics generalized Born surface area (MM-GBSA) module (Schrodinger suite, LLC, New York, NY, 2017-4). The binding free energy was calculated using the OPLS 2005 force field, the VSGB solvent model, and rotamer search methods [37]. After the MD run, trajectory frames were selected at 10 ns intervals. The total binding free energy was calculated using Eq 1:
$$\Delta G_{bind} = G_{complex} - (G_{protein} + G_{ligand}) \qquad (1)$$
where ΔGbind = binding free energy, Gcomplex = free energy of the complex, Gprotein = free energy of the target protein, and Gligand = free energy of the ligand. The MMGBSA outcome trajectories were analyzed further for post-dynamics structure modifications.
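As a worked illustration of the binding free energy definition (complex energy minus the sum of protein and ligand energies), the sketch below evaluates it per frame and averages over the sampled frames. The energy values are hypothetical placeholders, not results from this study, and the function names are assumptions for illustration.

```python
from statistics import mean


def binding_free_energy(g_complex, g_protein, g_ligand):
    """dG_bind = G_complex - (G_protein + G_ligand), all in kcal/mol."""
    return g_complex - (g_protein + g_ligand)


def average_binding_energy(frames):
    """Average dG_bind over frames given as (G_complex, G_protein, G_ligand)."""
    return mean(binding_free_energy(*frame) for frame in frames)


# Hypothetical per-frame energies (kcal/mol) for two sampled frames.
example_frames = [(-100.0, -60.0, -10.0), (-104.0, -60.0, -10.0)]
example_average = average_binding_energy(example_frames)
```

A more negative average indicates more favorable binding under this definition.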

Dynamic cross-correlation and principal component (PCA) analysis
A dynamic cross-correlation matrix (DCCM) was constructed across all Cα atoms of each complex over the 200 ns MD simulation to examine domain correlations. PCA was applied to the 200 ns simulation of 2ALA complexed with Lupenone to recover the global motions of the trajectories. For the PCA, a covariance matrix of the atomic coordinates was constructed. For conformational analysis of the Lupenone-bound complex, 20 conformational modes of the principal components were calculated as motions of the trajectories, and the first highest mode (PC2) was compared with PC10. Geo measures v 0.8 was used to calculate the free energy landscape of protein folding for the Lupenone-bound complex [36]. The energy profiles of folding along PC2 and PC10 were computed from the MD trajectory and recorded in a 3D plot using the matplotlib Python package through Geo measures, which includes a library built on g_sham.
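A dynamic cross-correlation matrix can be sketched directly from its usual definition, C_ij = <Δr_i · Δr_j> / sqrt(<Δr_i²><Δr_j²>), where Δr is the displacement of an atom from its mean position. The code below is a minimal illustration under the assumption that the trajectory has already been aligned and reduced to Cα coordinates; it is not the tool the authors used, and the function name is hypothetical.

```python
import numpy as np


def dccm(traj):
    """Dynamic cross-correlation matrix for an aligned trajectory.

    traj has shape (T, N, 3): T frames, N atoms. Returns an (N, N) matrix
    with entries in [-1, 1]; +1 means fully correlated motion and -1 fully
    anti-correlated motion. Atoms that never move would give a zero norm,
    so this sketch assumes every atom fluctuates.
    """
    disp = traj - traj.mean(axis=0)                       # displacements (T, N, 3)
    cov = np.einsum("tik,tjk->ij", disp, disp) / traj.shape[0]
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)
```

Plotting the result as a heat map gives the familiar block pattern of correlated and anti-correlated residue pairs.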

Virtual screening result
The 2ALA-92158 complex had a dock score (binding affinity) of -8.2 kcal/mol. The best molecule was then redocked at the binding cavity of 2ALA. Of the 5 compounds screened against the receptor 2ALA, Chem I.D. 92158 is displayed in Table 1, with an RMSD of 2.9 Å for 2ALA-92158. This virtual screening based on binding energy identified the ligand with the highest affinity for 2ALA.

Molecular docking for validation of docking score
In the molecular docking analysis of the E1 spike 2ALA with Lupenone, the best conformation of the docked complex was taken from the Autodock output. Receptors and ligands were saved in .pdbqt format for subsequent use with the MGL 1.5.6 suite. Vina was launched from the command line. In the setup, the default grid point spacing was 0.525 and the exhaustiveness was set to 8. The output files were in .pdbqt format and were analyzed using PyMol and Discovery Studio Visualizer 2021. Ligand binding was validated and optimized using the co-crystal ligand. The receptor and ligand were prepared by adding 48 polar hydrogens, detecting 1 rotatable bond, and adding Kollman and Gasteiger charges. Finally, both receptor and ligand molecules were stored in .pdbqt format. A grid box was centered at X = -1.655, Y = 57.005, and Z = 133.83 with a spacing of 0.375. Docking of the protein-ligand complex was carried out with the Genetic Algorithm (GA): the number of GA runs was set to 100, the population size to 200, the maximum number of energy evaluations to 2,500,000, and the maximum number of generations to 27,000. Further docking  (Fig 3, left).
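For readers who want to reproduce a comparable Vina run, the helper below assembles a configuration file from the reported grid center and exhaustiveness. It is a sketch under stated assumptions: the box dimensions (30 Å per side) and the file names are hypothetical because the text does not report them, and applying the reported grid center to the Vina search box is itself an assumption.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8, out="docked.pdbqt"):
    """Return the text of an AutoDock Vina configuration file."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


config_text = vina_config(
    receptor="2ALA_receptor.pdbqt",   # hypothetical file name
    ligand="lupenone.pdbqt",          # hypothetical file name
    center=(-1.655, 57.005, 133.83),  # grid center reported in the text
    size=(30, 30, 30),                # assumed box size in angstroms, not reported
)
```

The resulting text can be saved to a file and passed to Vina through its `--config` option.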

Molecular dynamics simulation (MD)
Molecular dynamics (MD) simulation studies were carried out to determine the stability and convergence of the Lupenone-bound E1 protein complex. Each 200 ns simulation displayed a stable conformation when the root mean square deviation (RMSD) values were compared. The RMSD fluctuations of the apo E1 protein were considerably higher than those of the Lupenone-bound E1 protein, signifying that the bound state adopts a more stable structure. The Cα backbone of 2ALA bound to Lupenone exhibited a deviation of 2.9 Å (Fig 4A). The RMSD plots are within the acceptable range, signifying the stability of the protein in the Lupenone-bound state before and after simulation. Lupenone bound to the E1 spike (PDB I.D. 2ALA) is therefore quite stable in the complex, possibly owing to significant binding of the ligand. The sizeable RMSD values are attributable to the low resolution (3.0 Å) of the only X-ray crystal structure available for the spike E1 protein in the Protein Data Bank (https://www.rcsb.org/structure/2ala). The radius of gyration is a measure of the compactness of the protein. The Lupenone-bound E1 protein displayed a lower radius of gyration (Fig 4B; R1, R2, R3) than the apo spike E1. From the overall RMSD and Rg analysis, it can be suggested that Lupenone binds stably within the binding cavity of the protein target and plays a significant role in the stability of the protein.
The root mean square fluctuation (RMSF) plots displayed significantly higher RMSF at a few residues in the apo E1 protein than in the Lupenone-bound spike E1 protein over the 200 ns simulation. The Lupenone-bound spike E1 protein exhibited the fewest residue fluctuations, whereas the apoprotein fluctuated at residues Val11, Lys61, Leu221, Lys241, Leu251, Val291, Lys321, Ala361, and Ser371 of 2ALA; the remaining residues fluctuated less during the entire 200 ns simulation (Fig 4C). The RMSF plots therefore suggest that the protein structure was stable during simulation in the Lupenone-bound conformation.
The average number of hydrogen bonds formed between Lupenone and the protein during the 200 ns simulation was also recorded (Fig 5). From 0 ns to 200 ns, an average of one hydrogen bond was found throughout the simulation, and the same held for the triplicate MD simulations of Lupenone and 2ALA (Fig 4D). This single hydrogen bond was confirmed from the 2D ligand binding plot. The hydrogen bonding between 2ALA and Lupenone strengthened the binding and helped keep the complex stable during the simulation. Stepwise trajectory analysis at every 25 ns of the simulation of Lupenone with 2ALA displayed the positional alteration of the ligand with reference to the 0 ns structure (Fig 5). Lupenone underwent an angular structural movement at the end frame to achieve conformational stability and convergence.

Free energy landscape analysis and solvent-accessible surface area (SASA)
The free energy landscape (FEL) of the Cα backbone atoms of the protein, plotted with respect to RMSD and radius of gyration (Rg), is displayed in Fig 6A. 2ALA bound to Lupenone reached the global minimum (lowest free energy state) at an RMSD of 0.65 nm and an Rg of 3.655 nm (Fig 7A). The FEL indicates that 2ALA moves deterministically toward its lowest energy state, owing to its high stability and favorable conformation in the Lupenone-bound state. The FEL is thus an indicator of the protein folding to attain a minimum energy state, which is achieved in the Lupenone-bound state.
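A free energy landscape over two collective variables is commonly estimated by Boltzmann inversion of their joint histogram, G(x, y) = -kB T ln[P(x, y) / Pmax]. The sketch below shows that calculation with NumPy. It is an assumption that this matches what geo_measures and g_sham compute internally; the bin count, the function names, and the synthetic data used to exercise it are all hypothetical.

```python
import numpy as np

KB_KJ_PER_MOL_K = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_landscape(x, y, temperature=300.15, bins=30):
    """Boltzmann-invert the joint histogram of two collective variables.

    Returns (G, x_centers, y_centers) with G in kJ/mol; the most populated
    bin has G = 0 and empty bins are assigned +inf.
    """
    hist, xedges, yedges = np.histogram2d(x, y, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):  # log(0) for empty bins is expected
        g = -KB_KJ_PER_MOL_K * temperature * np.log(prob / prob.max())
    xc = 0.5 * (xedges[:-1] + xedges[1:])
    yc = 0.5 * (yedges[:-1] + yedges[1:])
    return g, xc, yc


def global_minimum(g, xc, yc):
    """Return the (x, y) bin centers of the lowest free energy bin."""
    i, j = np.unravel_index(np.argmin(g), g.shape)
    return float(xc[i]), float(yc[j])
```

Here `x` and `y` would be the per-frame RMSD and Rg values; the default temperature corresponds to the 27 °C used in the simulations.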
Solvent-accessible surface area (SASA) provides information about the compactness of the protein complex on ligand binding. Here, the unbound protein (marked in red) displayed a higher SASA than Lupenone-bound 2ALA, which reflects the greater compactness of the protein when bound to the ligand (Fig 7B).

Molecular Mechanics Generalized Born and Surface Area (MMGBSA) calculations
MMGBSA is a popular method for calculating the binding energy of a ligand to a protein. The binding free energy of the protein-Lupenone complex and the contributions of the individual non-bonded interaction energies were estimated (Table 2). The average binding energy of Lupenone with 2ALA was -33.6119 kcal/mol. ΔGbind is influenced by various types of interactions, including ΔGbindCoulomb, ΔGbindCovalent, ΔGbindHbond, ΔGbindLipo, ΔGbindSolvGB, and ΔGbindvdW. Among these, the ΔGbindvdW, ΔGbindLipo, and ΔGbindCoulomb energies contributed most to the average binding energy. In contrast, the ΔGbindSolvGB and ΔGbindCovalent energies contributed least to the final average binding energy. In addition, the ΔGbindHbond values of the Lupenone-protein complex indicated stable hydrogen bonds with the amino acid residues. ΔGbindSolvGB and ΔGbindCovalent showed unfavorable energy contributions and thus opposed binding. As seen in Fig 8 (left panel), the pose of Lupenone in the binding pocket of 2ALA underwent a substantial angular movement (curved to straight) between pre-simulation (0 ns) and post-simulation (200 ns). These conformational changes result in better accommodation in the binding pocket and better interaction with the residues, giving higher stability and better binding energy.
Thus, the MM-GBSA calculations on the MD simulation trajectories agreed well with the binding energy obtained from docking. Moreover, the last frame (200 ns) displayed a positional change of Lupenone compared with the 0 ns frame, signifying a better binding pose that fits the binding cavity of the protein.
Therefore, it can be suggested that the Lupenone molecule has a good affinity for the major target 2ALA.

Dynamic cross-correlation matrices (DCCM), principal component analysis (PCA) & energy calculation
MD simulation trajectories were analyzed for dynamic cross-correlation among the domains of the protein chains bound to Lupenone. For correlated dynamic motion, the cross-correlation matrices of 2ALA were generated and are displayed in Fig 9A. The blue blocks in the figure indicate residues with highly correlated movement, and red blocks indicate the least correlation. The amino acid residues of Lupenone-bound 2ALA showed concerted movement (Fig 9A).

PLOS ONE
Principal component analysis (PCA) of the MD simulation trajectories of 2ALA bound to Lupenone was performed to interpret the randomized, global motion of the atoms of the amino acid residues. This analysis captures the more flexible, scattered trajectories that arise from distortion of the protein structure. The mobility of the internal coordinates in three-dimensional space over the 200 ns simulation was recorded in a covariance matrix. The rational motion of each trajectory is interpreted in the form of orthogonal sets, or eigenvectors. In the PCA of 2ALA, the Cα atoms of the MD trajectory displayed a scattered, unordered orientation in the first three modes because of their less equilibrated form. The first highest mode (PC1) accounted for 37.34% of the trajectories, with a variance of 505.65 and the least coordinated aggregate motion, 37.35. In contrast, PC10 described a much smaller variance, 13.05, among 0.964% of the trajectories, with an aggregated motion of 84.38. The combined plots of all three PC modes are displayed in Fig 9C. The PCA plot also shows that, as the sampling proceeds to PC mode 10, the trajectories become more aligned (Fig 9B). PCA therefore suggested that the relative aggregated motion of the trajectories along the eigenvectors improved at the higher mode PC10, converging into a global motion of the trajectories.
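The percentage attributed to each principal component follows from the eigenvalues of the coordinate covariance matrix. The sketch below shows one conventional way to obtain the modes and their explained variance with NumPy. It assumes an aligned Cα trajectory and is not the software the authors used; the function name and return layout are hypothetical.

```python
import numpy as np


def pca_modes(traj):
    """Principal component analysis of an aligned trajectory (T, N, 3).

    Returns (eigenvalues, eigenvectors, explained_percent, projections),
    sorted so that PC1 carries the largest variance.
    """
    n_frames, n_atoms, _ = traj.shape
    x = traj.reshape(n_frames, 3 * n_atoms)
    x = x - x.mean(axis=0)                      # remove the average structure
    cov = x.T @ x / (n_frames - 1)              # (3N, 3N) covariance matrix
    evals, evecs = np.linalg.eigh(cov)          # eigh returns ascending order
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    explained = 100.0 * evals / evals.sum()     # percent variance per mode
    projections = x @ evecs                     # frame coordinates along each PC
    return evals, evecs, explained, projections
```

Scatter plots of `projections[:, 0]` against a higher mode reproduce the kind of PC comparison described above.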
The energy profiles of the 2ALA-Lupenone complex system (Fig 9C) were determined to assess the stability of the entire system (S1 Data). The total energy (ETOT) of the 2ALA-Lupenone system was very stable, with an average of -27.00 kcal/mol (dark green). The van der Waals energy (vdW) nearly overlapped the total energy, with an average of -25 kcal/mol, making it the principal contributor to the stability of the 2ALA-Lupenone complex (cyan). Coulombic interactions played a minor role in system stability, contributing an average energy of -1.00 kcal/mol (red).

Conclusion
Small-molecule inhibitors inspired by natural products, with their low toxicity profile, may prove to be a great asset in drug discovery when time is of the essence. Current state-of-the-art computational approaches can be used to identify structural and pharmacophoric properties of active natural products that can be used as fusion inhibitors against SFV. This study reveals the potential of a naturally available molecule, Lupenone, to bind in the active site of the SFV spike protein in a highly specific binding pattern. As seen recently with Covid-19, a virus can suddenly emerge as capable of causing pandemic-scale infections and uncountable deaths. Although very few cases of human SFV infection have been reported, studies in rodent models have demonstrated the potential of SFV as a global threat. Earlier reports indicate that few medications or pharmacological candidates are available to treat SFV to date. Scientists worldwide should therefore be encouraged to advance our understanding of SFV pathophysiology and the development of antivirals aimed specifically at SFV. Improved computational and molecular modeling methods, together with analytical and experimental procedures, should be used to maximize the activity of naturally occurring chemicals that have already been identified. Our present study supports further in vivo studies and clinical trials of Lupenone to prevent Semliki Forest Virus pathogenicity and any future global pandemic.